We have often observed unexpected state transitions of complex systems. We are thus interested in how to 
steer a complex system from an unexpected state to a desired state. Here we introduce the concept of 
transittability of complex networks, and derive a new sufficient and necessary condition for state 
transittability which can be efficiently verified. We define the steering kernel as a minimal set of steering 
nodes to which control signals must directly be applied for transition between two specific states of a 
network, and propose a graph-theoretic algorithm to identify the steering kernel of a network for transition 
between two specific states. We applied our algorithm to 27 real complex networks, finding that sizes of 
steering kernels required for transittability are much less than those for complete controllability. 
Furthermore, applications to regulatory biomolecular networks not only validated our method but also 
identified the steering kernel for their phenotype transitions. 

Many complex systems of scientific interest can be represented as directed networks in which a set of nodes 
are connected in pairs by directed edges or arcs. Because of the interactions among nodes in a 
network, perturbing some nodes can affect other nodes, which may cause a state transition of the 
network. In reality, we have often observed unexpected state transitions of a complex system (for example, 
from a normal state to an abnormal state). Here we are interested in how to effectively steer the system from an 
unexpected state to a desired state by applying suitable input control signals. The main purpose of this work is to 
provide a theoretical framework that addresses such an issue for complex networks, especially, regulatory 
biomolecular networks. 

A regulatory biomolecular network is orchestrated by the interactions of many molecules in a cell. A living 
cell should stay at a normal (at least, healthy) phenotype. However, by some unknown perturbation or stimuli, a 
regulatory biomolecular network can be transited from a normal phenotype to a disease phenotype. It thus is 
desirable to steer the regulatory network to transit from the abnormal phenotype to a healthy phenotype. To study 
the phenotype transitions, a regulatory biomolecular network is represented by a directed network in which 
molecules are represented by nodes and the interactions between molecules are represented by arcs. As a 
result, cellular phenotypes can be defined by the network states that represent all the molecular expressions in the 
network collectively, while a phenotypic change or cellular behavior change can be described as a dynamic 
transition between two states of the network, as in complex disease progression, the p53-mediated DNA 
damage response network, T helper cell differentiation, and the epithelial to mesenchymal transition. 

The empirical studies in the cellular reprogramming field have indicated that one phenotype can be transited to 
another by overexpressing a few transcription factors (steering nodes). In the field of network-based 
methodologies for drug design, trial-and-error methods based on researchers' experience 
have found that a few drug targets are enough to achieve a transition from a disease state to a healthy state for 
many complex diseases. For example, acute promyelocytic leukemia (APL), a subtype of acute myeloid 
leukemia (AML), has been successfully treated with therapy that utilizes all-trans-retinoic acid (ATRA). 
However, among patients with non-APL AML, ATRA-based treatment has not been effective. Based on the 
literature and experimental verifications, Schenk et al. have concluded that ATRA plus tranylcypromine (TCP) 
can effectively treat non-APL AML. From the viewpoint of dynamic 
systems control theory, the cellular network in charge of APL AML 
cell line can be transited from the abnormal state to a healthy state 
through the targets of ATRA while the cellular network in charge of 
non-APL AML cell line cannot be transited from the abnormal state 
to a healthy state through only the targets of ATRA. However, current 
research in these fields involves little or no control theory, 
although control theory has been successfully applied to study the 
state transition of engineering systems. 

Although the concept of controllability of dynamic linear systems 
can be applied to complex networks, most of their parameters 
are either unknown or known only approximately and are 
time dependent. In addition, even if all the parameters are available, 
the determination of controllability is computationally prohibitive 
even for moderate-size networks. Nevertheless, Liu et al. have 
recently applied the concept of structural controllability to study 
the controllability of directed complex networks, and derived the 
theoretical result of complete controllability, i.e., the transition 
between any two states of a network (rather than two specific states). 
According to their result, the minimum number of driver nodes is 
80% of nodes in a regulatory biomolecular network in order to have 
complete controllability, which seems to contradict some empirical 
findings in the cellular reprogramming field. In order to reduce the 
number of driver nodes, Nepusz and Vicsek have studied the complete 
controllability of complex networks in terms of edge dynamics 
instead of node dynamics. In addition, with a strong assumption that 
each driver node can control its outgoing edges independently, 
Nacher and Akutsu have studied the complete controllability of 
bipartite networks. In fact, a phenotype can be considered as a 
high-dimensional attractor of the complex network. The transition 
between two specific phenotypes (rather than any two phenotypes) 
stems from the change of states of some (not all) nodes in a subspace 
of the full state space. In addition, complete controllability often 
requires more steering nodes and affects the full state space of the 
network (Figure 1b). Therefore, requiring complete controllability 
asks for too much when studying the transition between two specific 
states. 

Differently from complete controllability between any two states, 
here we aim to develop a theoretic framework for studying transi- 
tions between two specific states of directed complex networks by 
introducing a new concept of transittability, and further apply our 
theoretical results for identifying the steering kernels of 27 real com- 
plex networks and 4 biomolecular networks. Here, "real complex 
networks" mean that the networks are constructed based on math- 
ematical models of real systems. Specifically, we first define the con- 
cepts of transittability of a complex directed network and develop a 
new sufficient and necessary condition for transittability under 
which a specific structural state of a complex network can be trans- 
ited to another. Our new condition can be efficiently verified by a 
graph-theoretic algorithm. We call a node on which an input control 
signal is directly acted as a steering node. We then define the steering 
kernel as a minimal set of steering nodes to steer the network to 
transit from one state to another. Here we stress that the steering 
kernel is different from the minimum set of driver nodes of Liu 
et al. As illustrated in Supplementary Information I, we show that 
a network cannot be guaranteed to transit from one specific state to 
any other state by acting only on the minimum set of driver nodes 
identified by the minimum input theorem, whereas the steering 
kernel defined in this work can ensure such a transition. Further- 
more, we develop a graph-theoretic algorithm to identify the steering 
kernel for transition between two specific states. We apply our algo- 
rithm to 27 real complex networks, finding that the minimum num- 
bers of steering nodes required for transittability are much less than 
those for complete controllability. In addition, we also apply our 
algorithms to several real biomolecular networks, finding that not 
only is the number of the identified steering nodes for cellular 
phenotype transitions small, but also the identified steering nodes 
are consistent with empirical findings in the literature. 

Results 

Transittability. Although most complex dynamic systems are nonlinear, 
the controllability of nonlinear systems is in many aspects 
structurally similar to that of linear systems. Indeed, to ultimately 
develop control strategies for complex nonlinear networks, a 
necessary and fundamental step is to investigate the controllability 
(especially structural controllability) of complex networks with 
linear dynamics. In this study, we thus consider the linear 
time-invariant nodal dynamics of a complex network with n nodes, 
where the activity of node i, $x_i(t)$, can be described by the following 
equations 

$$\dot{x}_i(t) = \sum_{j=1}^{n} a_{ij} x_j(t) + \sigma_i u_i(t) \quad \text{for } i = 1, 2, \ldots, n \tag{1}$$

where $a_{ij} \neq 0$ if node j directly affects node i, that is, there is an arc 
from node j to node i in the network, and otherwise $a_{ij} = 0$. $\sigma_i = 1$ if 
input control signal $u_i(t)$ directly acts on node i and otherwise $\sigma_i = 0$. 
In this study, we are interested not in the complete controllability, 
but in the transittability of the system (1), which concerns the 
transition between two specific states by a suitable choice of input 
control signals (Figure 1c). Formally, the system (1) is said to be 
transittable between two given specific states $x_0$ and $x_1$ if it can be 
transited between $x_0$ and $x_1$ in finite time $t_f$ by proper input control 
signals $u(t)$ ($t \in [0, t_f]$). Note that the system (1) is transittable between 
any two states by simply acting one independent input control signal 
on each of the n nodes. That is, if all nodes are steering nodes, then we have 
$\sigma_i = 1$ for $i = 1, 2, \ldots, n$ and thus $\sum_{i=1}^{n} \sigma_i = n$. However, in this study we 
are interested in finding the minimum set of steering nodes (called the 
steering kernel) to achieve the state transition between two specific 
states, in other words, minimizing $\sum_{i=1}^{n} \sigma_i$ while the system (1) is 
transittable between two given specific states $x_0$ and $x_1$. 
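
The remark that system (1) is transittable between any two states when every node is a steering node can be checked numerically. The following is a minimal sketch, assuming $\sigma_i = 1$ for all i, so that the input $u(t) = \dot{x}_d(t) - A x_d(t)$ tracks any reference trajectory $x_d(t)$. The function name `steer_with_all_nodes`, the straight-line reference trajectory, the forward Euler discretization, and the 3-node chain are assumptions made for illustration only.

```python
import numpy as np

def steer_with_all_nodes(A, x0, x1, tf=1.0, steps=1000):
    """Integrate dx/dt = A x + u with one input per node (B = I)."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    dt = tf / steps
    velocity = (x1 - x0) / tf              # constant slope of the reference path
    x = x0.copy()
    for k in range(steps):
        x_ref = x0 + velocity * (k * dt)   # reference state at time k * dt
        u = velocity - A @ x_ref           # input cancelling the drift A x_ref
        x = x + dt * (A @ x + u)           # forward Euler step of equation (1)
    return x

# Hypothetical 3-node chain 1 -> 2 -> 3 (a_21 = a_32 = 1).
A = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
final_state = steer_with_all_nodes(A, x0=[1.0, -2.0, 0.5], x1=[0.0, 3.0, 1.0])
print(np.allclose(final_state, [0.0, 3.0, 1.0]))
```

This trivial strategy uses n inputs, which is exactly what the steering kernel is meant to avoid.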

The system (1) can be rewritten in the vector-matrix format as 
follows 

$$\dot{x}(t) = A x(t) + B u(t) \tag{2}$$

where the n-dimensional vector $x(t) = (x_1(t), \ldots, x_n(t))^T$ represents 
the state of the network with n nodes at time t. The $n \times n$ matrix $A = (a_{ij})$ 
describes the interaction relationship and strength between 
nodes. The $n \times p$ matrix B is called the input control matrix that 
corresponds to the steering nodes. The p-dimensional vector 
$u(t) = (u_1(t), \ldots, u_p(t))^T$ represents the input control signals. As in many 
situations, one controller cannot produce multiple independent 
input control signals. Here we assume that one controller can 
produce only one independent input control signal. Therefore, all 
elements in the j-th column of matrix B are zeroes except for the s-th 
element if the j-th input control signal directly acts on node s. Our 
theoretical result (Supplementary Information Section II) shows that 
the system (2) is transittable between states $x_0$ and $x_1$ if and only if 
there exists a positive number $t_f$ such that 

$$e^{A t_f} x_0 - x_1 \in \mathrm{span}(C) \tag{3}$$

$$C = [B, AB, \ldots, A^{n-1}B] \tag{4}$$

where $\mathrm{span}(C)$ represents the subspace spanned by the column vectors 
of matrix C and is called the controllable subspace. Now finding 
the steering kernel to steer the system (2) from state $x_0$ to $x_1$ can be 
formulated as a problem of finding the matrix B with the minimum 
number of columns such that condition (3) is true. However, the 
calculation of $e^{A t_f}$ is complicated and condition (3) is computationally 
intractable (Figure 1d). 

Figure 1 | Illustration of network transittability. (a) Considering the transittability of the network between two specific states $x_0$ and $x_1$ as shown here and 
finding the steering kernel for such a state transition. Nodes with the same color at the two states (e.g., v^, Vg) are unchanged while nodes 
with different colors (e.g., $v_2$, $v_1$, $v_3$) are changed between the two states. "*" in the structure matrix A of the network represents the free 
parameters while "0" represents the fixed parameters. "*" in the two states $x_0$ and $x_1$ marks nodes whose values differ between the two states while "0" 
marks nodes whose values are the same at the two states. (b) From the concept of complete controllability, two input control signals should be 
directly applied to two steering nodes Vf, and $v_5$ to transit the network between any two states, including the two specific states $x_0$ and $x_1$. One can see that (1) 
more steering nodes than necessary are needed and (2) the full state space with six dimensions is affected for such a state transition, which may cause side 
effects. (c) From our new concept of transittability and new theorems, only one steering node $v_3$ is needed for the transition between the two specific states $x_0$ 
and $x_1$. (d) This is the traditional sufficient and necessary condition for transittability between the two specific states $x_0$ and $x_1$, which is actually intractable. (e) Using 
our new sufficient and necessary condition for transittability between the two specific states $x_0$ and $x_1$, we can identify the steering kernel by solving an optimal 
assignment of weighted bipartite graphs via an efficient graph-theoretic algorithm. 

Next, we state our first key result, 
i.e., we prove (see Supplementary Information Section II) that the 
system (2) is transittable between two specific states $x_0$ and $x_1$, with 
either belonging to $\mathrm{span}(C)$, if and only if 

$$\mathrm{rank}(\bar{C}) = \mathrm{rank}(C) \tag{5}$$

$$\bar{C} = [\bar{B}, A\bar{B}, \ldots, A^{n-1}\bar{B}] \tag{6}$$

where $\bar{B} = [x_0 - x_1, B]$. The transittability is typically considered 
between two stable states or between one stable state and another state. 
If $x_1$ is a stable state, we can always assume that $x_1 = 0$ (the 
origin) without loss of generality, i.e., we can replace $x_0$ with $x_0 - x_1$ if 
$x_1 \neq 0$, which does not affect the result. Then the system 
(2) is transittable between a specific state $x_0$ and the origin if and 
only if 

$$\mathrm{rank}(C_0) = \mathrm{rank}(C) \tag{7}$$

$$C_0 = [B_0, AB_0, \ldots, A^{n-1}B_0] \tag{8}$$

where $B_0 = [x_0, B]$. Although condition (7) is much easier to 
verify than condition (3), the calculation of either $\mathrm{rank}(C_0)$ or 
$\mathrm{rank}(C)$ is still prohibitive because of the large size n of a complex 
network and the uncertainty and time dependence of the entries in matrix 
A and/or vector $x_0$. Note that transittability concerns not only controlling 
a state to the origin, but also controlling a state from the origin to other 
specific states. 
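
For a small network with known numerical entries, condition (7) can be checked directly. The sketch below builds the matrices of equations (4) and (8) and compares their ranks. The function names `controllability_matrix` and `is_transittable_to_origin`, the tolerance, and the 3-node chain example are hypothetical choices for illustration; this direct rank computation is not the graph-theoretic algorithm developed in this work and does not scale to large networks.

```python
import numpy as np

def controllability_matrix(A, B):
    """Return [B, AB, ..., A^(n-1) B] for an n x n matrix A, as in equation (4)."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_transittable_to_origin(A, B, x0, tol=1e-9):
    """Check condition (7): rank(C_0) == rank(C) with B_0 = [x0, B]."""
    B0 = np.hstack([x0.reshape(-1, 1), B])
    rank_c = np.linalg.matrix_rank(controllability_matrix(A, B), tol=tol)
    rank_c0 = np.linalg.matrix_rank(controllability_matrix(A, B0), tol=tol)
    return bool(rank_c0 == rank_c)

# Hypothetical chain 1 -> 2 -> 3 with a single input acting on node 2.
A = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
B = np.array([[0.0], [1.0], [0.0]])

# Only nodes 2 and 3 differ from the origin: reachable through node 2.
print(is_transittable_to_origin(A, B, np.array([0.0, 1.0, 2.0])))
# Node 1 differs from the origin: not reachable through node 2.
print(is_transittable_to_origin(A, B, np.array([1.0, 0.0, 0.0])))
```

In this example the controllable subspace has dimension 2, so the pair (A, B) is not completely controllable, yet the first transition is still possible.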

Identifying the steering kernel. To overcome the computational 
obstacle in verifying condition (7), we further introduce the 
structural transittability via the concepts of structural matrix and 
generic dimension of controllable subspace. $\mathbf{M}$ is said to be a 
structural matrix if its entries are either fixed zeros or independent 
free parameters. A matrix $M$ is called admissible (with respect to $\mathbf{M}$) if it can be 
obtained by fixing the free parameters of $\mathbf{M}$ at some specific values. If 
$\mathbf{A}$ and $\mathbf{B}$ are structural matrices, system (2) is called a structural 
system and is denoted by $(\mathbf{A}, \mathbf{B})$. Associated with a structural 
system $(\mathbf{A}, \mathbf{B})$, a directed graph $G(\mathbf{A}, \mathbf{B}) = (V, E)$ can be defined 
with the set of nodes $V = V_A \cup V_B$, where $V_A = \{x_1, \ldots, x_n\} := \{v_1, \ldots, v_n\}$ 
is the set of state vertices, corresponding to the n state components, 
while $V_B = \{u_1, \ldots, u_p\} := \{v_{n+1}, \ldots, v_{n+p}\}$ is the set of input 
vertices, corresponding to the p inputs, and the set of arcs $E = E_A \cup E_B$, 
where $E_A = \{(x_j, x_i) = (v_j, v_i) \mid a_{ij} \neq 0\}$ is the set of 
directed edges between state vertices while $E_B = \{(u_j, x_i) = (v_{n+j}, v_i) \mid b_{ij} \neq 0\}$ 
is the set of directed edges between input vertices and 
state vertices. We can also define a directed network $G(\mathbf{A}) = (V_A, E_A)$ 
with respect to a structural matrix $\mathbf{A}$ (Figure 1a). In a directed graph, 
an elementary path is a sequence of arcs 
$\{(v_{i_0} \to v_{i_1}), (v_{i_1} \to v_{i_2}), \ldots, (v_{i_{k-1}} \to v_{i_k})\}$ where all vertices 
$\{v_{i_0}, \ldots, v_{i_k}\}$ are different, and 
when $v_{i_0} = v_{i_k}$, it is called an elementary cycle. A stem is an 
elementary path originating from an input vertex in $V_B$. 
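
The construction of the directed graph from the zero patterns of the structural matrices can be sketched as follows. This is an illustrative sketch only: the function name `build_structural_graph` and the encoding of a structural matrix as a 0/1 array (1 marking a free parameter) are assumptions, and vertices are numbered 1 to n for states and n+1 to n+p for inputs as in the definition above.

```python
import numpy as np

def build_structural_graph(A_pattern, B_pattern):
    """Return (V_A, V_B, E_A, E_B) for 0/1 patterns of the structural matrices."""
    n = A_pattern.shape[0]
    p = B_pattern.shape[1]
    state_vertices = list(range(1, n + 1))           # v_1, ..., v_n
    input_vertices = list(range(n + 1, n + p + 1))   # v_{n+1}, ..., v_{n+p}
    # a_ij != 0 gives an arc from state vertex j to state vertex i.
    state_arcs = [(j + 1, i + 1)
                  for i in range(n) for j in range(n) if A_pattern[i, j]]
    # b_ij != 0 gives an arc from input vertex n + j to state vertex i.
    input_arcs = [(n + j + 1, i + 1)
                  for i in range(n) for j in range(p) if B_pattern[i, j]]
    return state_vertices, input_vertices, state_arcs, input_arcs

# Hypothetical chain 1 -> 2 -> 3 with one input acting on node 2.
A_pattern = np.array([[0, 0, 0],
                      [1, 0, 0],
                      [0, 1, 0]])
B_pattern = np.array([[0], [1], [0]])
V_A, V_B, E_A, E_B = build_structural_graph(A_pattern, B_pattern)
print(E_A, E_B)
```

In this example the arcs (4, 2) and (2, 3) form a stem, since the path starts at the input vertex 4 and visits each vertex once.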

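A structural system $(\mathbf{A}, \mathbf{B})$ is reducible if there exists a permutation 
matrix P such that 

$$P\mathbf{A}P^{T} = \begin{bmatrix} \mathbf{A}_{11} & 0 \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix}, \quad P\mathbf{B} = \begin{bmatrix} 0 \\ \mathbf{B}_{2} \end{bmatrix} \tag{9}$$

with $\mathbf{A}_{11} \in R^{n_1 \times n_1}$, $\mathbf{A}_{22} \in R^{n_2 \times n_2}$ and $\mathbf{B}_2 \in R^{n_2 \times p}$, $1 \le n_1 \le n$ and $n_1 + n_2 = n$. 
Otherwise $(\mathbf{A}, \mathbf{B})$ is said to be irreducible. 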

The dimension of the controllable subspace of structural system 
$(\mathbf{A}, \mathbf{B})$ varies as a function of the free parameters in structural matrices $\mathbf{A}$ 
and $\mathbf{B}$. That is, for different admissible systems $(A, B)$, the dimensions 
of their controllable subspaces might be different. As the 
rank of matrix C is at most n, the dimension of the controllable subspace 
of structural system $(\mathbf{A}, \mathbf{B})$ attains a maximum value. We define 
this maximum value as the generic dimension of the controllable 
subspace of structural system $(\mathbf{A}, \mathbf{B})$ and denote it by $\mathrm{GDCS}(\mathbf{A}, \mathbf{B})$. 
The $\mathrm{GDCS}(\mathbf{A}, \mathbf{B})$ is a generic property in the sense that for 
almost all admissible systems $(A, B)$ (with respect to $(\mathbf{A}, \mathbf{B})$) the 
dimension of their controllable subspaces takes a constant value, which is 
$\mathrm{GDCS}(\mathbf{A}, \mathbf{B})$. Hosoe has proved that if $(\mathbf{A}, \mathbf{B})$ is irreducible, then 

$$\mathrm{GDCS}(\mathbf{A}, \mathbf{B}) = \max_{G \in G^*} |E(G)| \tag{10}$$

where $G^*$ denotes the set of subgraphs of $G(\mathbf{A}, \mathbf{B})$, which is defined as 

$$G^* = \{\, G \subseteq G(\mathbf{A}, \mathbf{B}) \mid G \text{ consists of elementary cycles and at most } p \text{ stems in } G(\mathbf{A}, \mathbf{B}); \text{ the elementary cycles and stems have no node in common} \,\}$$

and $|E(G)|$ denotes the number of edges in G. Applying condition (7) to 
the structural system $(\mathbf{A}, \mathbf{B})$, identifying the steering kernel by which 
the network $G(\mathbf{A})$ can be transited between a state $x_0$ and the origin 
becomes finding a structural control matrix $\mathbf{B}$ with the minimum 
number of columns such that 

$$\mathrm{GDCS}(\mathbf{A}, \mathbf{B}) = \mathrm{GDCS}(\mathbf{A}, \mathbf{B}_0). \tag{11}$$

Our second key result is that we develop a graph-theoretic algorithm 
to identify the steering kernel by solving an optimal assignment 
problem of a weighted bipartite graph (Figure 1e). For details, 
see the Materials and Methods and the Supplementary Information 
III. 
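
For very small networks, condition (11) can be explored by brute force. The sketch below is not the graph-theoretic assignment algorithm of this work: it swaps in a simpler technique, estimating the generic dimension by filling the free parameters with random values and then enumerating node subsets of increasing size. The function names `generic_rank` and `brute_force_steering_kernel`, the 0/1 pattern encoding, the sampling range, and the star-shaped example network are all assumptions made for illustration, and the enumeration is exponential in the number of nodes.

```python
import itertools
import numpy as np

def generic_rank(A_pattern, B_pattern, rng, trials=3):
    """Estimate the GDCS by assigning random values to the free parameters."""
    n = A_pattern.shape[0]
    best = 0
    for _trial in range(trials):
        A = A_pattern * rng.uniform(0.5, 1.5, size=A_pattern.shape)
        B = B_pattern * rng.uniform(0.5, 1.5, size=B_pattern.shape)
        blocks = [B]
        for _power in range(n - 1):
            blocks.append(A @ blocks[-1])
        best = max(best, int(np.linalg.matrix_rank(np.hstack(blocks))))
    return best

def brute_force_steering_kernel(A_pattern, x0_pattern, seed=0):
    """Return a smallest tuple of node indices (0-based) satisfying condition (11)."""
    rng = np.random.default_rng(seed)
    n = A_pattern.shape[0]
    x0_column = np.asarray(x0_pattern).reshape(-1, 1)
    for size in range(1, n + 1):
        for subset in itertools.combinations(range(n), size):
            B_pattern = np.eye(n)[:, list(subset)]       # one input per steering node
            B0_pattern = np.hstack([x0_column, B_pattern])
            if generic_rank(A_pattern, B_pattern, rng) == generic_rank(A_pattern, B0_pattern, rng):
                return subset
    return tuple(range(n))

# Hypothetical star network: node 1 regulates nodes 2 and 3.
star = np.array([[0, 0, 0],
                 [1, 0, 0],
                 [1, 0, 0]])
# Only node 2 differs between the two states.
print(brute_force_steering_kernel(star, np.array([0, 1, 0])))
```

In this toy network a single steering node suffices when only node 2 changes, while a transition that changes both nodes 2 and 3 needs two steering nodes under this check.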

Transittability of complex networks. We apply our algorithm to 27 
complex networks to determine their steering kernels and the results 
are summarized in Table 1. These 27 networks are a portion of the 38 
complex networks in refs 7,8; the number of their nodes ranges 
from 32 to 27772 while the number of edges ranges from 96 to 
352807. The phenotypes of complex networks are typically defined 
by a small portion of nodes. For example, the number of 
molecules (such as genes or proteins) significantly involved in a 
specific human disease is only a small portion of all molecules in a 
network. Therefore, this study assumes that a transition 
between two specific states has 20% or 50% of nodes whose state 
values are changed, that is, $x_0$ in (7) and (8) has 20% or 50% of 
nonzero elements. The fraction of steering nodes is defined as the 
ratio of the size of the steering kernel to the total number of nodes in the 
network. Columns 5-8 in Table 1 are the average results of 1000 
randomly defined transitions of each network. Columns 7 and 
8 respectively list the average fraction of steering nodes for 
transittability when 20% and 50% of nodes are differently 
expressed at the two states, while Column 9 lists the fraction of driver 
nodes from Liu et al. Comparing Columns 7 and 8 to Column 9 
shows that the minimum numbers of steering nodes required for 
transittability are much smaller than those for complete controllability. 
For complete controllability, the generic dimension of controllable 
space is the number of nodes in the networks listed in Column 3. 
Columns 5 and 6 respectively list the average generic dimension of 
controllable space for transittability when 20% and 50% of nodes 
are differently expressed at the two states. Comparing Columns 5 and 6 
to Column 3 shows that the controllable spaces for transittability 
are much smaller than those for complete controllability. 
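
The randomly defined transitions described above can be generated along the following lines. This is a minimal sketch under stated assumptions: the function names `random_transition_pattern` and `steering_fraction`, the uniform sampling of changed nodes without replacement, and the rounding of the changed-node count are illustrative choices, since the text specifies only the 20% and 50% fractions and the 1000 repetitions.

```python
import numpy as np

def random_transition_pattern(n, changed_fraction, rng):
    """Return a 0/1 vector marking which entries of x0 are nonzero."""
    n_changed = int(round(changed_fraction * n))
    pattern = np.zeros(n, dtype=int)
    changed_nodes = rng.choice(n, size=n_changed, replace=False)
    pattern[changed_nodes] = 1
    return pattern

def steering_fraction(kernel_size, n):
    """Ratio of the steering kernel size to the number of nodes."""
    return kernel_size / n

rng = np.random.default_rng(0)
# 1000 hypothetical transitions for a 100-node network with 20% of nodes changed.
patterns = [random_transition_pattern(100, 0.2, rng) for _ in range(1000)]
print(len(patterns), int(patterns[0].sum()))
```

Each pattern would then be passed to a steering kernel routine, and the resulting fractions averaged over the 1000 transitions.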

Table 1 | Comparison of transittability and complete controllability of complex networks 
Notations are as follows: the average generic dimension of controllable space ($\mathrm{GDCS}^{20\%}$ and $\mathrm{GDCS}^{50\%}$) for 20% and 50% of nodes changed between two specific states, respectively; the average fractions 
of steering nodes ($n^{20\%}$ and $n^{50\%}$) from our methods for 20% and 50% of nodes changed between two specific states, respectively; and the fraction of driver nodes ($n_D$) from Liu et al. 

Applications to regulatory biomolecular networks. We employ 
four different biological systems with different phenotypes in order 
to demonstrate the applicability of our method, as well as to validate our 
theoretical results. These four examples are the p53-mediated DNA 
damage response network (three phenotypes), the T helper differentiation 
cellular network (three phenotypes), the yeast cell cycle network 
(three phenotypes), and the epithelial to mesenchymal transition 
network (two phenotypes). Table 2 shows the identified steering 
kernels for transition between two phenotypes. 

p53-mediated DNA damage response network. This network, 
consisting of 17 molecules and 40 interactions as shown in 
Figure 2 and Table S1, responds to cell stresses such as DNA 
damage and can stay at three phenotypes. If there is no DNA 
damage, ATM is in its inactive dimer form (ATM2), the level of phosphorylated 
monomer (ATM*) is low, and p53 remains inactive. DNA 
damage can lead to two different cellular phenotypes: cell cycle arrest 
and apoptosis. At the cell cycle arrest phenotype, ATM is activated 
by DNA damage through auto-phosphorylation and transited from the 
inactive dimer (ATM2) to ATM*. Subsequently, p53 is activated by 
ATM* and transited to p53* (p53 arrester). The expression levels 
of molecules represented by the green nodes in Figure 2 are 
oscillating. The p21 is the product of this state, which induces cell 
arrest. In total, the expression values of 9 nodes are significantly 
changed when the normal phenotype is transited to the arrest 
phenotype. At the apoptosis phenotype, ATM* still activates p53 
to p53*, but most p53* is in the form of p53 killer. p53AIP1 is 
activated by p53 killer and finally activates Casp3, which induces 
cell apoptosis. At this state, PTEN contributes to full activation of 
p53. In total, the expression values of 12 nodes are significantly 
changed when the normal phenotype is transited to the apoptosis 
phenotype. In addition, comparing the arrest phenotype and the 
apoptosis phenotype, the expression values of all 17 nodes are 
significantly changed. Applying our methods to this network yields 
the steering kernel consisting of PTEN and p53DINP1 for the 
transition between normal and apoptosis phenotypes; the steering 
kernel consisting of Wip1 and p53DINP1 for the transition between 
normal and cell cycle arrest phenotypes; and the steering kernel 
consisting of Wip1, PTEN and p53DINP1 for the transition 
between apoptosis and cell cycle arrest phenotypes. These results 
are in good agreement with the results of Zhang et al., where PTEN 
and Wip1 are identified as key players for transitions between different 
states. On the other hand, if complete controllability is applied to 
this network (Figure S5), the minimum number of driver nodes is 3 
while the driver nodes are not unique. For example, one minimum 
set of driver nodes consists of Wip1, PTEN and p53DINP1, which are 
the same as the steering nodes for the transition between apoptosis and 
cell cycle arrest phenotypes. However, this set of nodes is redundant 
for the other two transitions. Furthermore, the complete control strategy 
with these three driver nodes will affect the full state space during 
the phenotype transition as shown in Table 2, which is clearly 
undesirable in practice (see Discussion). 

Table 2 | The number of molecules. 

Figure 2 | P53-mediated cell damage response network and its phenotype transitions. A green-filled node means the level of this node is oscillating at that 
state, a red-filled node means high level while an empty node means low level. The steering kernels are labelled for different transitions. 

T helper differentiation cellular network. T helper cells (Th cells) 
are a sub-group of lymphocytes, a type of white blood cell, which 
play an important role in the immune system, particularly in the 
adaptive immune system. They help the activity of other immune 
cells by releasing T cell cytokines. Mature Th cells express the 
surface protein CD4 and are referred to as CD4+ T cells, which can be 
classified as Th0 (precursor), Th1 and Th2 (effector) cells. Previously 
published experiments suggest that T-bet and GATA3 can induce 
both transitions from Th0 to Th1 and from Th0 to Th2. To better 
understand the mechanism of transitions among these phenotypes, 
Mendoza constructs a core network in charge of the differentiation 
of Th cells, which contains 17 nodes with 27 interactions as shown in 
Figure 3 and Table S2. Comparing these three phenotypes, 
one can see that 5, 4, and 9 molecules are significantly differentially 
expressed between Th0 and Th1 phenotypes, between Th0 and Th2 
phenotypes, and between Th1 and Th2 phenotypes, respectively. 
Applying our methods to the T helper differentiation cellular 
network, we identify the steering nodes SOCS1 and T-bet for the 
transition between Th0 and Th1 and the steering nodes IL-4 and 
GATA3 for the transition between Th0 and Th2, which is in 
agreement with existing results. We also identify the steering 
nodes T-bet and GATA3 for the transition between Th1 and Th2, 
which is completely in agreement with the experimental data. 
However, if complete controllability is applied to this network 
(see Figure S6), the minimum number of driver nodes is three and the 
three driver nodes are IL-12, IL-18 and IFN-β. Actually, without any 
one of these three nodes, this network cannot be completely 
controlled. Although it has been reported that IL-12 and IL-18 
together can make the transition from Th0 to Th1, this complete 
control strategy will affect the full state space during the transition as 
shown in Table 2, which is undesirable in practice (see Discussion). 

Figure 3 | T helper differentiation cellular network and its phenotype transitions. A red-filled node means high level expression and an empty node 
means low level expression. The steering kernels are labelled for different transitions. 

Yeast cell cycle network. The cell-cycle process is a vital biological 
process by which one cell grows and divides into two daughter cells. To 
study this process, Li et al. have established a molecular network 
consisting of 11 essential molecules with 34 interactions as shown in 
Figure 4 and Table S3. Applying logic-like operations and using 
an exhaustive search, they have found seven stationary states 
(attractors), each corresponding to a stable phenotype. The attractor 
with the largest basin size corresponds to the G1 stationary state of 
the cell (denoted by phenotype 1). The next two largest attractors 
may represent some common disorder states of the cell (denoted by 
phenotypes 2 and 3). As the other four attractors have small basin sizes, 
we do not consider them in this study. Comparing these three 
phenotypes, we can see that 4, 1, and 5 nodes are significantly 
differentially expressed between phenotypes 1 and 2, between 
phenotypes 1 and 3, and between phenotypes 2 and 3, respectively. On the 
one hand, applying complete controllability to this 
network gives a minimum number of driver nodes of 1, which 
could be Cln3, MBF or SBF (Figure S7). However, via either MBF or 
SBF this network cannot be completely controlled, as neither of them 
regulates node Cln3 (Figure 4). Although via Cln3 the 
network can be completely controlled, the full state space will be 
affected as shown in Table 2, which is undesirable (see 
Discussion). On the other hand, we apply our methods to this network 
to study the transitions among those three phenotypes. We 
found a single steering node MBF for the transition between 
phenotypes 1 and 3, and a single steering node SBF for both the 
transition between phenotypes 1 and 2 and the transition between 
phenotypes 2 and 3, which suggests that SBF and MBF play an 
important role in the transitions among these three phenotypes in 
the cell-cycle process. 

Figure 4 | Yeast cellular network and its phenotype transitions. A red-filled node means high level expression while an empty node means low level 
expression. The steering kernels are labelled for different transitions. 

Epithelial to Mesenchymal Transition (EMT) network. EMT is a 
phenomenon in which cells change their genetic and transcriptional 
program, leading to the alteration of phenotypes and functions. 
This change starts the metastatic dissemination which causes most 
human cancer deaths. To study EMT, Moes et al. have constructed 
an EMT network consisting of 6 nodes and 15 interactions as shown 
in Figure 5 and Table S4. The expressions of MIR203, MIR200 and 
CDH1 are high and those of ZEB2, SNAI1 and ZEB1 are low at the epithelial 
phenotype, while all are reversed at the mesenchymal phenotype. 
Therefore, for this network, all 6 nodes are significantly differentially 
expressed between epithelial and mesenchymal phenotypes. 
Applying our algorithm, we can identify node SNAI1 as the steering node 
for the transition between these two phenotypes, which is completely in 
agreement with the experimental result verified by Moes et al. that 
SNAI1 can activate the transition from epithelial to mesenchymal 
phenotype. Actually, by applying our algorithm, we can identify 
any one of the nodes except for CDH1 as the steering node for the 
transition between these two phenotypes. According to other recent 
literature, MIR203 and MIR200 can also induce the transition, while 
the roles of ZEB2 and ZEB1 in the transition between these two 
phenotypes deserve further investigation. In 
fact, controlling the transition between these two phenotypes is 
complete control of the network. When the minimum input 
theorem is applied to this network (Figure S8), any one of the six 
nodes could be the driver node to steer the network from one 
phenotype to any other phenotype. However, acting input control 
signals on CDH1 cannot make the transition between these two 
phenotypes, as CDH1 does not regulate any nodes in the network 
(Figure 5). 

Discussion 

Transittability is at the heart of understanding state transitions of 
complex dynamic systems, especially cellular processes such as 
cellular reprogramming and genetic disorder progression. Besides 
the empirical studies, control theory for dynamical systems 
has recently been applied to complex systems. As indicated in previous 
discussions and Supplementary Information I, complete controllability 
of complex networks generally needs more steering nodes 
and its control affects the full state space (Figure 1b), and it is thus not 
suitable for studying the transittability and identifying the steering kernel 
for state transitions. Although the recently developed control strategy 
of nonlinear systems is applicable to study the transittability, it 
needs to know the exact expression of the nonlinear functions and 
parameters in the model of complex systems, which is generally 
unavailable in practice. 

Figure 5 | EMT network and its phenotype transitions. A red-filled node 
means high level expression while an empty node means low level 
expression. One of the steering kernels is labelled for the transition. 

Instead of steering a directed network from any initial state to any 
desired state, as the concept of complete controllability requires, 
transittability concerns the ability to steer a directed network from 
one specific state to another specific state. Obviously, if a directed 
network is completely controllable, it can be steered from one specific 
state to another specific state, which indicates that complete 
controllability is sufficient for transittability. However, complete 
controllability is not necessary for transittability and should even be 
avoided in practice. For example, when considering the transition from 
a disease phenotype to a healthy phenotype, we want to affect as small 
a state subspace as possible, because unnecessarily changing some nodes 
in a large state subspace might cause side effects. The state subspace 
affected by a control law can be measured by the generic dimension of 
the controllable (sub)space (GDCS). The GDCS for complete 
controllability is always the full-dimensional state space, while the GDCS 
Table 3 | GDCS for complete controllability and transittability 

Network                                     GDCS for complete controllability   Phenotype transitions   GDCS for transittability 
p53-mediated DNA damage response network    17 
T helper differentiation cellular network   17 
Yeast cell cycle network                    11 
EMT network                                 6 



for two-state transittability is generally a small subspace of the full 
state space (see Tables 1 and 3). Therefore, in principle, a control 
law based on complete controllability affects more states than one 
based on transittability. In addition, as discussed in 
Supplementary Information I, a network cannot be guaranteed to 
transit from one specific state to any other state by applying input 
control signals to only the minimum set of driver nodes identified by 
the minimum input theorem. Furthermore, although theoretically 
the steering kernel could be a subset of the minimum driver nodes, 
the minimum input theorem cannot be used to efficiently find 
the minimum number of steering nodes. Firstly, although the 
minimum number of driver nodes identified by the minimum input 
theorem is unique, the maximum matching for a given network is 
not unique. Actually, finding all maximum matchings for a given 
network is an NP-hard problem. This means that there is no efficient 
way to find all possible sets of driver nodes. Secondly, by the 
minimum input theorem, the minimum number of driver nodes is about 
0.8n for regulatory networks with n nodes. The number of possible 
combinations of 0.8n driver nodes grows exponentially with n, and thus 
it is computationally prohibitive to exhaustively check all of them. 
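
The non-uniqueness of the maximum matching can be seen on a toy example. The following is only a minimal sketch under stated assumptions: it uses brute-force enumeration, which is feasible only for very small graphs, and the function names `maximum_matchings` and `driver_sets` are illustrative and do not come from the paper. A matching is taken here to be a set of directed edges that share no start node and no end node, and the driver nodes of a matching are the nodes without a matched incoming edge.

```python
from itertools import combinations

def maximum_matchings(edges):
    """Brute force: all largest edge sets sharing no start node and no end node."""
    for k in range(len(edges), 0, -1):
        found = [m for m in combinations(edges, k)
                 if len({a for a, _ in m}) == k and len({b for _, b in m}) == k]
        if found:
            return found
    return [()]  # graph without edges: the empty matching

def driver_sets(nodes, edges):
    """Driver nodes of each maximum matching: nodes with no matched incoming edge."""
    return {frozenset(set(nodes) - {b for _, b in m})
            for m in maximum_matchings(edges)}

# Toy star network: node 0 regulates nodes 1 and 2.
nodes = [0, 1, 2]
edges = [(0, 1), (0, 2)]
for drivers in sorted(sorted(d) for d in driver_sets(nodes, edges)):
    print(drivers)
```

In this toy network both maximum matchings contain one edge, so the number of driver nodes is two in either case, but the driver sets themselves differ.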

In this paper, we have systematically studied the transittability of 
directed networks and proposed an algorithm to identify the steering 
kernel for transitions between two specific states. To bypass the need 
to know the exact expressions of the nonlinear functions and 
parameters in complex systems, we have studied the transittability of 
directed networks using the concepts of structural linear systems and 
structural transittability. Our theoretical results provide a 
sufficient and necessary condition for transittability, which amounts 
to checking whether or not two GDCSs are equal. Although our 
theorems have been developed for continuous time-invariant linear 
systems, they can be directly applied to discrete time-invariant linear 
systems. Moreover, similar to previously published theorems, our 
results remain unchanged even if the free parameters in a linear system 
are allowed to vary with time. That is, our theoretical results are 
applicable to time-varying linear systems. 

To identify the steering nodes for the transition between two 
states, we have developed a graph-theoretic algorithm that solves 
an optimal assignment problem on a weighted bipartite graph. 
Applying our algorithm to 27 complex networks, we have found that 
the minimum numbers of steering nodes for transitions between two 
states are smaller than those for complete controllability, and that 
the controllable spaces for transittability are smaller than those for 
complete controllability. Furthermore, we have applied our algorithm 
to four regulatory biomolecular networks and found that the numbers of 
steering nodes for transitions between two cellular phenotypes are 
small, which agrees well with empirical studies on these networks. In 
addition, the majority of the steering nodes found by our method have 
already been reported in existing empirical studies, while the other, 
new steering nodes are potentially important in the corresponding 
cellular phenotype transitions. Therefore, we believe that our results 
also provide some fundamentals for understanding the mechanism of 
cellular phenotype transitions and, as such, are expected to have 
implications for network-based drug design. The theorems developed in 
this study can be directly applied to other complex networks, for 
example social networks, power grids, food webs, the Internet, and 
electronic circuits, to name just a few. In this paper, we mainly 
focused on studying transittability with suitable input signals. The 
implementation or design of the control input signals, as well as the 
analysis of the dependence between the size of the steering kernel and 
the degree distribution of networks, could be one direction of future 
work. 

Methods 

Network construction. The T helper differentiation cellular network, the EMT 
network, and the yeast cell cycle network are taken directly from the published 
references without any change. The p53-mediated DNA damage 
response network is constructed from the differential equations in the 
supplementary material of the cited paper. In this construction, each variable in the 
differential equations corresponds to a node in the network. Node i regulates node j if 
the variable corresponding to node i appears on the right-hand side of the 
differential equation corresponding to node j. All self-loops corresponding to 
degradation are excluded from the constructed network, as their weights may not be free 
parameters. 
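
The construction rule can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_network`, the input format (a mapping from each variable to the variables on the right-hand side of its equation), and the three variables `x`, `y`, `z` are hypothetical and are not taken from the p53 model. As a simplification, the sketch drops every self-loop, whereas the text excludes those corresponding to degradation.

```python
def build_network(rhs_variables):
    """Return directed edges (i, j), meaning node i regulates node j.

    rhs_variables maps each variable j to the set of variables that appear
    on the right-hand side of the differential equation for j.
    """
    edges = set()
    for j, sources in rhs_variables.items():
        for i in sources:
            if i != j:  # simplification: every self-loop is dropped
                edges.add((i, j))
    return edges

# Hypothetical three-variable system, used only to show the rule:
#   dx/dt = f(x, z),  dy/dt = g(x, y),  dz/dt = h(y)
toy_rhs = {"x": {"x", "z"}, "y": {"x", "y"}, "z": {"y"}}
toy_edges = build_network(toy_rhs)
print(sorted(toy_edges))
```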

Calculating the GDCS and the size of the steering kernel. For a network G(A) and 
structural state x_0, assume that the structural system (A, x_0) is irreducible. Let S be the 
subset of nodes corresponding to the non-zero components of x_0. Let us define a weighted 
graph G'(A) as follows: 1) associate the weight 1 with every edge e of G(A); 2) for v_i ∈ S, 
add the edge e = v_j v_i and associate the weight ε if e = v_j v_i is not in G(A); 
3) for v_i ∉ S, add the loop e = v_i v_i and associate the weight 0 if e = v_i v_i is not in G(A), 
where ε is a small positive number less than 1/n for a network with n nodes. 
For simplicity, ε can take the value 0.001, 0.0001, 0.00001 or the like. By solving an 
optimal assignment problem on a weighted bipartite graph representation of G'(A), we can find 
the maximum weight circle partition of G'(A). Assume that the weight of the 
maximum circle partition is r + s*ε, where r and s are integers and s*ε < 1. Let t be the 
number of source strongly connected components of G(A). Then, from Supplementary 
Information III.C, we have GDCS(A, B) = GDCS(A, B_0) = r + s and the size of the 
steering kernel is s + t. Note that the computational complexity of solving an optimal 
assignment problem on a weighted bipartite graph is O(n^3) in the 
worst case, in which the network is a complete graph. For sparse networks, which is 
the case for most real networks, the computational complexity is less than O(n^3). Table S5 
and Figure S9 report the computational complexity observed 
for real complex networks with the number of nodes ranging from 32 to 27772. 
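
The computation of r and s can be sketched as below. This is a minimal sketch under several assumptions, not the authors' implementation. Step 2 is read as adding an ε-weighted edge from every node v_j to each node v_i in S. A circle partition is treated as a permutation of the nodes along edges of G'(A). Weights are scaled by n + 1, which corresponds to ε = 1/(n + 1) < 1/n, so that all arithmetic stays in integers. The function names are illustrative, the assignment solver is a textbook Hungarian algorithm, and the count t of source strongly connected components is not computed here.

```python
def max_weight_assignment(w):
    """Hungarian algorithm, O(n^3). Returns sigma, where sigma[i] is the
    column chosen for row i, maximizing sum(w[i][sigma[i]])."""
    n = len(w)
    INF = float("inf")
    u = [0] * (n + 1)    # row potentials
    v = [0] * (n + 1)    # column potentials
    p = [0] * (n + 1)    # p[j]: row matched to column j (1-based, 0 = none)
    way = [0] * (n + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = -w[i0 - 1][j - 1] - u[i0] - v[j]  # cost = -weight
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    sigma = [0] * n
    for j in range(1, n + 1):
        sigma[p[j] - 1] = j - 1
    return sigma

def gdcs_terms(n, edges, S):
    """Return (r, s) for the maximum weight circle partition of G'(A)."""
    scale = n + 1               # assumption: epsilon = 1 / (n + 1)
    absent = -scale * scale     # so low that no optimum uses an absent edge
    w = [[absent] * n for _ in range(n)]
    for i in range(n):
        if i in S:
            for j in range(n):
                w[j][i] = 1     # step 2: epsilon edge j -> i (scaled)
        else:
            w[i][i] = 0         # step 3: zero-weight loop
    for i, j in edges:
        w[i][j] = scale         # step 1: edges of G(A) have weight 1 (scaled)
    sigma = max_weight_assignment(w)
    total = sum(w[i][sigma[i]] for i in range(n))
    return divmod(total, scale)

# Toy chain 0 -> 1 -> 2 with S = {0}.
r, s = gdcs_terms(3, [(0, 1), (1, 2)], {0})
print(r, s, r + s)
```

For the toy chain the best circle partition uses both chain edges plus one added ε edge back to node 0, so r + s = 3, which matches the fact that a chain driven from its head node is fully controllable.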